12/05/2024 - 18/05/2024

12/05/2024 23:14

To incorporate the ethernet splitter that will support the multi-crate system once it is set up, I plugged the 1GbE connection into the splitter and ran another cable from the splitter to the MCH. I could still ping the MCH with no edits to the network script below:
/etc/sysconfig/network-scripts/ifcfg-enp5s0
(1GbE)

#
# Connect to MCH
#
TYPE=Ethernet
BOOTPROTO=static
IPADDR=192.168.1.1
NETMASK=255.255.255.128
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp5s0
DEVICE=enp5s0
ONBOOT=yes

However, we need to support a second crate that will be on the 192.168.{crate #}.xxx network, so I changed the netmask to accept any value in the third octet of the IP address:

/etc/sysconfig/network-scripts/ifcfg-enp5s0
(1GbE)

#
# Connect to MCH
#
TYPE=Ethernet
BOOTPROTO=static
IPADDR=192.168.1.1
NETMASK=255.255.0.0
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp5s0
DEVICE=enp5s0
ONBOOT=yes

13/05/2024 00:26

poll_event seems to lock the GPU thread while it checks whether data is available for readout.

INT poll_event(INT source __attribute__((unused)), INT count, BOOL test)
{
  // fake calibration
  if (test) {
    for (int i = 0; i < count; i++) {
      usleep(1);
    }
    return 0;
  }

  INT retval = 0; 
  BOOL data_avail = FALSE;          // true if data is available for readout

  // Check GPU buffer
  pthread_mutex_lock( &mutex_GPU_general );
  if (GPUfillnumber > Midasfillnumber)
  {
    data_avail = TRUE;
  }
  if (GPUfillnumber < Midasfillnumber && GPUfillnumber != 0) // handles wrap-around past the largest unsigned long, which is only possible in a very long run
  {
    unsigned long buffer_filled = 0xffffffffffffffff - (Midasfillnumber - GPUfillnumber) + 1;
    if (buffer_filled < 0xffffffffffffffff / 2)
    {
      data_avail = TRUE;
    }
  }
  pthread_mutex_unlock( &mutex_GPU_general );

//  if (run_state == STATE_RUNNING) {
    if (data_avail) {
      retval = 1;
    }
//  }
  return retval; 
} // poll_event

This shouldn't be a problem, though: the check is short, and it doesn't run until it can obtain the lock anyway.


13/05/2024 00:59

I've tracked down where all the timestamps are made:

  1. TCP proc unlocked, tcp_thread.cxx:624
    // get time of start of read / unpack AMC13 event
    status = gettimeofday( &tstart, NULL);
    header[1] = tstart.tv_sec;  // fill header time info in header
    header[2] = tstart.tv_usec; // fill header time info in header
  2. got TCP header word, tcp_thread.cxx:1041
    // record time got header word
    gettimeofday( &theader, NULL);
    header[3] = theader.tv_sec; // fill header time info in header
    header[4] = theader.tv_usec; // fill header time info in header
  3. got TCP header word 2, tcp_thread.cxx:670
    // get time done read / unpack of AMC13 event 
    status = gettimeofday( &tdata, NULL);
    header[5] = tdata.tv_sec; // fill data time info in header
    header[6] = tdata.tv_usec; // fill data time info in header
  4. GPU proc unlocked, gpu_thread.cpp:557
    gettimeofday( &tstart, NULL);
    about 40 lines later...
    //Add the GPU processing start time stamp
    GPU_Data_Buffer[GPUbufferindex].gpu_data_header[7] = tstart.tv_sec; 
    GPU_Data_Buffer[GPUbufferindex].gpu_data_header[8] = tstart.tv_usec; 
  5. GPU copy done, gpu_thread.cpp:701
    // get GPU copy time for GPU thread
    gettimeofday( &tcopy, NULL);
    dbprintf("%s(%d): duration of start to copy, fdt = %e us \n", __func__, __LINE__, toddiff( &tstart, &tcopy) );
    trigger_info.time_gputhread_copytogpu_done_s = tcopy.tv_sec;
    trigger_info.time_gputhread_copytogpu_done_us = tcopy.tv_usec;     
    GPU_Data_Buffer[GPUbufferindex].gpu_data_header[9] = tcopy.tv_sec; // fill copy to GPU time info in header
    GPU_Data_Buffer[GPUbufferindex].gpu_data_header[10] = tcopy.tv_usec; // fill copy to GPU time info in header
  6. GPU proc done, gpu_thread.cpp:765
    // get GPU run time for GPU thread
    gettimeofday( &tprocess, NULL);
    dbprintf("%s(%d): duration of copy to process, fdt = %e us \n", __func__, __LINE__, toddiff( &tprocess, &tcopy) );
    trigger_info.time_gputhread_finished_s = tprocess.tv_sec;
    trigger_info.time_gputhread_finished_us = tprocess.tv_usec;     
    GPU_Data_Buffer[GPUbufferindex].gpu_data_header[11] = tprocess.tv_sec;
    GPU_Data_Buffer[GPUbufferindex].gpu_data_header[12] = tprocess.tv_usec;
  7. MFE proc unlocked, frontend.cpp:2714
      status = gettimeofday( &t_lock_data, NULL);
      trigger_info.time_slave_lock_dataready_s  = t_lock_data.tv_sec;
      trigger_info.time_slave_lock_dataready_us = t_lock_data.tv_usec;
    
      // store timing information and current TCPfillnumber, GPUfillnumber in header databank
      GPUDATA->gpu_data_header[13] = t_lock_data.tv_sec;
      GPUDATA->gpu_data_header[14] = t_lock_data.tv_usec;
  8. MFE banks made, frontend.cpp:3288
      status = gettimeofday( &t_got_data, NULL);
      trigger_info.time_slave_got_data_s  = t_got_data.tv_sec;
      trigger_info.time_slave_got_data_us = t_got_data.tv_usec;
    
      // make more header / timing data
      // array elements 17, 18 reserved for compression timing data
      GPUDATA->gpu_data_header[15] = t_got_data.tv_sec;
      GPUDATA->gpu_data_header[16] = t_got_data.tv_usec;
  9. lossless compression, frontend.cpp:3463
      status = gettimeofday( &t_done_compression, NULL);
      perf_data[17] = t_done_compression.tv_sec;
      perf_data[18] = t_done_compression.tv_usec;

13/05/2024 01:03

What happens in between each time step?

  1. got TCP header word - TCP proc unlocked
    1. Lock TCP thread, call part of read and unpack
          pthread_mutex_lock( &mutex_TCP_buf[bufIndex] );  
      
          // function reads / unpacks the AMC13 block structure
          gettimeofday( &tbeginread, NULL);
          databytes = readAndUnpack( bufIndex );
    2. Declare local read and unpack variables, read first word:
      int readAndUnpack(int bufIndex){
      
      //#ifdef DEBUG
        unsigned int EventIndex;  // AMC13 reported event number
        unsigned int OverallSize; // event size in AMC13 header
      //#endif
        int iAMC, nAMC;  // AMC13 reported number of AMC modules 
      
      //#ifdef DEBUG
        int local_headerbytes = TCPheadersize;
      //#endif
      
        int block_status = 0;
      
        int retval = ReadXBytes( clientsockfd,  sizeof(uint64_t), (void*)( header ),block_status );
      //  printf("Read Header: %d vs %d",retval, sizeof(uint64_t));
      
        // get overall CDF header word
        if (retval < int(sizeof(uint64_t)))
        {
          if ( retval < 0 ) 
          {                                                                                    
            cm_msg(MERROR, __FILE__, "Cannot read header from socket");                                 
            return -1;                                                                  
          }else if (retval == 0)
          {
            if (block_status == 1 )
            {
          return 0;
            }else{
          cm_msg(MERROR, __FILE__, "Cannot read header from socket");                                 
          return -1;                                                                  
            }
          }else{
          cm_msg(MERROR, __FILE__, "Cannot read header from socket");                                 
            return -1;
          }
        }
      
        // get event number from header bank
      //#ifdef DEBUG
        EventIndex = getEventIndex( be64toh( *header ) );
      //#endif
      
        // pointer location to AMC13 unpacking info in amc13info data array
        offsetamc13info = amc13info;
        // write CDF header word in the amc13info array
        *offsetamc13info = *header;
        dbprintf("%s(%d): read header, header size [bytes] %d, header[0] 0x%016lX, BODdelimiter 0x%016lX, BODmask 0x%016lx Event number %i\n", 
               __func__, __LINE__, local_headerbytes, *offsetamc13info, BODdelimiter, BODmask, EventIndex );
        offsetamc13info++;
      
        // record time got header word
        gettimeofday( &theader, NULL);
        header[3] = theader.tv_sec; // fill header time info in header
        header[4] = theader.tv_usec; // fill header time info in header
  2. got TCP header word 2 - got TCP header word
    1. The rest of read and unpack happens:
      dbprintf("%s(%d): duration from AVAIL lock to fill header bank, buffer[%d], fill %d, duration %e us \n", 
               __func__, __LINE__, bufIndex, TCPfillnumber, toddiff( &theader, &tstart) );
      
  // byte / block counters for AMC modules x AMC blocks readout structure
        int blockdatabytes = 0; // individual AMC module bytes per AMC13 block
        int totaldatabytes = 0; // running total of all AMC modules data bytes 
        int blockcount = 0;     // AMC13  block counters
      
        // data offsets for unpacking data buffer structure of AMCs x blocks
        unsigned int dataoffset = 0, datablockoffset[12], dataAMCoffset[12];
        memset( datablockoffset, 0, sizeof(datablockoffset) ); // block offset of particular AMC modules data
        memset( dataAMCoffset, 0, sizeof(dataAMCoffset) ); // overall offset of particular AMC modules data
      
  bool moredata = 1; // more data is true if more blocks are available
        while ( moredata ){  // loops over AMC data blocks 
      
          // read single 64-bit AMC13 block header word
    //Try reading once before giving up
          int read_fail = 0;
          while (read_fail<1)
          {
            retval  = ReadXBytes( clientsockfd, sizeof(uint64_t), (void*)( offsetamc13info ) ,block_status);
            if (retval>0)
            {
          break;
            }
            usleep(100000);
            read_fail++;
          }
          if (read_fail>=1)
          {
            cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d, for %d times", clientsockfd,retval,sizeof(uint64_t),read_fail);  
            cm_msg(MERROR,__FILE__, "read header, header size [bytes] %d, header[0] 0x%016lX, BODdelimiter 0x%016lX, BODmask 0x%016lx Event number %i",local_headerbytes, *offsetamc13info, BODdelimiter, BODmask, EventIndex);
          }
          if ( retval < int(sizeof(uint64_t)) ) 
          {                                                                                    
            cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,sizeof(uint64_t));  
            cm_msg(MERROR,__FILE__, "read header, header size [bytes] %d, header[0] 0x%016lX, BODdelimiter 0x%016lX, BODmask 0x%016lx Event number %i",local_headerbytes, *offsetamc13info, BODdelimiter, BODmask, EventIndex);
            return -1;                                                                  
          }
      
          // get the number of enabled AMCs
          nAMC = getAMCNum( be64toh( *offsetamc13info ) );
          offsetamc13info++;
          dbprintf("%s(%d): reading AMC general header word 0x%016lX, nAMC decoded %i\n", 
               __func__, __LINE__, *offsetamc13info, getAMCNum( be64toh( *offsetamc13info ) ) );
      
          // WARN if mismatch between ODB and AMC13 headers / trailers for number of active modules for first block
          if ( ( blockcount == 0 ) && ( nAMC != NRiderModuleEnabled ) ) {
            cm_msg(MERROR, __FILE__, "WARNING! mismatch between ODB (%i) and AMC13 headers (%i) for number of AMC modules", NRiderModuleEnabled, nAMC);
            dbprintf("%s(%d): WARNING! mis-match between ODB (%i) and AMC13 headers (%i) for number of AMC modules\n", __func__, __LINE__,  NRiderModuleEnabled, nAMC);
          }
      
          // read 64-bit AMC module header words - one per AMC
          retval  = ReadXBytes( clientsockfd, nAMC*sizeof(uint64_t), (void*)( offsetamc13info) ,block_status);
          if ( retval < int(nAMC*sizeof(uint64_t)) ) 
            {                                                                                    
          cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,nAMC*sizeof(uint64_t));  
          return -1;                                                                  
            }
      
          // WARN if mismatch between ODB and AMC13 headers / trailers for AMC slot number
          for (iAMC = 0; iAMC < nAMC; iAMC++){
            if ( !amc13_rider_odb[amc_header_info[iAMC].AMCSlotNum-1].board.rider_enabled ) {
          //cm_msg(MERROR, __FILE__, "WARNING! AMC slot %i not enabled in ODB", amc_header_info[iAMC].AMCSlotNum);
          dbprintf("%s(%d): WARNING! amc_header_info[iAMC].AMCSlot %i\n", __func__, __LINE__, amc_header_info[iAMC].AMCSlotNum);
            }
          }
      
          // decode AMC header words - get continuation bits, event / block size, AMC slot number
          // set moredata = 1 if more blocks are following this block
          moredata = 0;
          for (iAMC = 0; iAMC < nAMC; iAMC++){
            if ( decodeAMCHeader( iAMC, be64toh( *( offsetamc13info ) ) ) != 0 )
          {
            printf("decodeAMCHeader() failed!");
          }
            offsetamc13info++;
            if (amc_header_info[iAMC].AMCMoreBit) moredata = 1;
      
           dbprintf("%s(%d): AMC index %d, AMC Slot number %d, AMCMoreBit %d, more data %d, AMCEventSize 0x%08x\n", 
                __func__, __LINE__, iAMC, amc_header_info[iAMC].AMCSlotNum, amc_header_info[iAMC].AMCMoreBit,  moredata, amc_header_info[iAMC].AMCEventSize );
          }
      
          // calculate AMC data offsets dataAMCoffset[amc_header_info[iAMC].AMCSlotNum-1] from total event sizes in S=0 word AMC header word  
          // (i.e. for either M=1,S=0 with continuation blocks or M=0,S=0 with only one block)
    // This calculation is performed once per fill / event and handles different total data sizes,
    // i.e. amc_header_info[iAMC].AMCEventSize, from different AMC modules
          if ( !amc_header_info[0].AMCSegBit ) {
            int AMCoffsetbytes = 0;      
            for (iAMC = 0; iAMC < nAMC; iAMC++){
          dataAMCoffset[amc_header_info[iAMC].AMCSlotNum-1] = AMCoffsetbytes / sizeof(uint64_t);
          dbprintf("%s(%d): blockcount %d, AMC index %d, calculated AMC total data offset 0x%08x\n", 
               __func__, __LINE__, blockcount, iAMC, dataAMCoffset[amc_header_info[iAMC].AMCSlotNum-1]); 
          AMCoffsetbytes += sizeof(uint64_t)*amc_header_info[iAMC].AMCEventSize;
            }
          }
      
          // read AMC data block
          for (iAMC = 0; iAMC < nAMC; iAMC++){
      
            // calculate the data bytes - blockdatabytes - to read for each AMC module with index iAMC 
            // bits determine if first block, intermediate block, last block or single block
            if ( amc_header_info[iAMC].AMCMoreBit && (!amc_header_info[iAMC].AMCSegBit) )
          {
            blockdatabytes = 32768;
            dbprintf("M=1,S=0 first block in segment, set size to 0x%08x bytes (odb 0x%08x)\n", 
                 blockdatabytes, amc13_amc13_odb.amc_block_size);
          }
            if ( amc_header_info[iAMC].AMCMoreBit && amc_header_info[iAMC].AMCSegBit )
          {
            dbprintf("M=1,S=1 intermediate block in segment, set size from amc header word\n");
            blockdatabytes = sizeof(uint64_t)*amc_header_info[iAMC].AMCEventSize;
          }
            if ( (!amc_header_info[iAMC].AMCMoreBit) && amc_header_info[iAMC].AMCSegBit )
          {
            dbprintf("M=0,S=1 last block in segment, set size from amc header word\n");
            blockdatabytes = sizeof(uint64_t)*amc_header_info[iAMC].AMCEventSize;	  
          }
            if ( (!amc_header_info[iAMC].AMCMoreBit) && (!amc_header_info[iAMC].AMCSegBit) )
          {
            dbprintf("M=0,S=0 only block in segment, set size from amc header word\n");
            blockdatabytes = sizeof(uint64_t)*amc_header_info[iAMC].AMCEventSize;
          }
      
            // calculated the location to put the data from block structure in AMC13 event
            dataoffset = dataAMCoffset[amc_header_info[iAMC].AMCSlotNum-1] + datablockoffset[amc_header_info[iAMC].AMCSlotNum-1];
            dbprintf("%s(%d): blockcount %d, iAMC %d, calculated AMC+Block data offset 0x%08x block data bytes 0x%08x data bytes total 0x%08x\n", 
                 __func__, __LINE__, blockcount, iAMC, dataoffset, blockdatabytes, totaldatabytes); 
      
            // read the data block for each AMC module in array tcp_buf_gl[bufIndex]
            retval  = ReadXBytes( clientsockfd, blockdatabytes, (void*)( tcp_buf_gl[bufIndex] + dataoffset ) ,block_status);
          if ( retval < blockdatabytes) 
          {                                                                                    
            cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,blockdatabytes);  
            return -1;                                                                  
          }
            dbprintf("%s(%d): done reading AMC block %i bytes %i, dataoffset %d, (tcp_buf_gl[bufIndex] + dataoffset ) %p, data[0] 0x%16lx data[1] 0x%16lx\n", 
                 __func__, __LINE__, blockcount, blockdatabytes, dataoffset, ( tcp_buf_gl[bufIndex] + dataoffset ), 
                 *( tcp_buf_gl[bufIndex] + dataoffset ), *( tcp_buf_gl[bufIndex] + dataoffset + 1 ) ); 
      
            //dataoffset += blockdatabytes/sizeof(uint64_t); // redundant so removed?
            datablockoffset[amc_header_info[iAMC].AMCSlotNum-1] += blockdatabytes/sizeof(uint64_t); // datablockoffset[i] is individual payload readout from ith AMC module
            totaldatabytes += blockdatabytes; // totaldatabytes is total payload readout from all AMC modules
            dbprintf("%s(%d): end of read loop for amc %i\n",__func__, __LINE__,iAMC);
          }
      
          // read single 64-bit AMC13 block trailer word
          retval = ReadXBytes( clientsockfd, sizeof(uint64_t), (void*)( offsetamc13info ) ,block_status);
          if ( retval < int(sizeof(uint64_t))) 
            {                                                                                    
          cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,sizeof(uint64_t));  
          return -1;                                                                  
            }
          dbprintf("%s(%d): done reading AMC block %i, trailer word *tmp 0x%08lx\n", 
               __func__, __LINE__, blockcount, *offsetamc13info); 
      
          offsetamc13info++;
          blockcount++;
        }
        dbprintf("%s(%d): finished data read / unpack, databytes total 0x%08x block count %i\n", 
             __func__, __LINE__, totaldatabytes, blockcount); 
      
        // get CDF trailer word
        retval = ReadXBytes( clientsockfd, tailbytes, (void*)(tail) ,block_status);
        if ( retval < int(tailbytes)) 
          {                                                                                    
            cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,tailbytes);  
            return -1;                                                                  
          }
      
      #ifdef DEBUG
        OverallSize = getOverallSize( be64toh(tail[0]) );
      #endif
      
        dbprintf("%s(%d): read trailer, trailer size [bytes] %d, tail[0] 0x%016lX, EODdelimiter 0x%016lX, EODmask 0x%016lX, Overall Size %i\n", 
             __func__, __LINE__, tailbytes, be64toh(tail[0]), EODdelimiter, EODmask, OverallSize);
      
      #if 0 // turn on/off CPU-based byte-reordering in 8-byte AMC13 words 
      
        // re-order data from network / big-endian to little-endian
        struct timeval tbeforeReorderBytes, tafterReorderBytes;
        gettimeofday( &tbeforeReorderBytes, NULL);
      
        int iReorderBytes, nReorderBytes = totaldatabytes / sizeof(uint64_t);
        for (iReorderBytes = 0; iReorderBytes < nReorderBytes; iReorderBytes++){
          tcp_buf_gl[bufIndex][iReorderBytes] = be64toh( tcp_buf_gl[bufIndex][iReorderBytes] );
        }
      
        gettimeofday( &tafterReorderBytes, NULL);
        dbprintf("%s(%d): duration of byte re-ordering, buffer[%d], fill %d, duration %e us \n", 
           __func__, __LINE__, bufIndex, TCPfillnumber, toddiff( &tafterReorderBytes, &tbeforeReorderBytes) );
      #endif 
      
        return totaldatabytes;
      }
      dbprintf("%s(%d): duration from AVAIL lock to fill header bank, buffer[%d], fill %d, duration %e us \n", 
               __func__, __LINE__, bufIndex, TCPfillnumber, toddiff( &theader, &tstart) );
      
        // byte / block counters for AMC modules x AMC blocks readoout structure
        int blockdatabytes = 0; // individual AMC module bytes per AMC13 block
        int totaldatabytes = 0; // running total of all AMC modules data bytes 
        int blockcount = 0;     // AMC13  block counters
      
        // data offsets for unpacking data buffer structure of AMCs x blocks
        unsigned int dataoffset = 0, datablockoffset[12], dataAMCoffset[12];
        memset( datablockoffset, 0, sizeof(datablockoffset) ); // block offset of particular AMC modules data
        memset( dataAMCoffset, 0, sizeof(dataAMCoffset) ); // overall offset of particular AMC modules data
      
        bool moredata = 1; // more data is true of more blocks are available
        while ( moredata ){  // loops over AMC data blocks 
      
          // read single 64-bit AMC13 block header word
          //Try reading 1 times before giving up
          int read_fail = 0;
          while (read_fail<1)
          {
            retval  = ReadXBytes( clientsockfd, sizeof(uint64_t), (void*)( offsetamc13info ) ,block_status);
            if (retval>0)
            {
          break;
            }
            usleep(100000);
            read_fail++;
          }
          if (read_fail>=1)
          {
            cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d, for %d times", clientsockfd,retval,sizeof(uint64_t),read_fail);  
            cm_msg(MERROR,__FILE__, "read header, header size [bytes] %d, header[0] 0x%016lX, BODdelimiter 0x%016lX, BODmask 0x%016lx Event number %i",local_headerbytes, *offsetamc13info, BODdelimiter, BODmask, EventIndex);
          }
          if ( retval < int(sizeof(uint64_t)) ) 
          {                                                                                    
            cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,sizeof(uint64_t));  
            cm_msg(MERROR,__FILE__, "read header, header size [bytes] %d, header[0] 0x%016lX, BODdelimiter 0x%016lX, BODmask 0x%016lx Event number %i",local_headerbytes, *offsetamc13info, BODdelimiter, BODmask, EventIndex);
            return -1;                                                                  
          }
      
          // get the number of enabled AMCs
          nAMC = getAMCNum( be64toh( *offsetamc13info ) );
          offsetamc13info++;
          dbprintf("%s(%d): reading AMC general header word 0x%016lX, nAMC decoded %i\n", 
               __func__, __LINE__, *offsetamc13info, getAMCNum( be64toh( *offsetamc13info ) ) );
      
          // WARN if mismatch between ODB and AMC13 headers / trailers for number of active modules for first block
          if ( ( blockcount == 0 ) && ( nAMC != NRiderModuleEnabled ) ) {
            cm_msg(MERROR, __FILE__, "WARNING! mismatch between ODB (%i) and AMC13 headers (%i) for number of AMC modules", NRiderModuleEnabled, nAMC);
            dbprintf("%s(%d): WARNING! mis-match between ODB (%i) and AMC13 headers (%i) for number of AMC modules\n", __func__, __LINE__,  NRiderModuleEnabled, nAMC);
          }
      
          // read 64-bit AMC module header words - one per AMC
          retval  = ReadXBytes( clientsockfd, nAMC*sizeof(uint64_t), (void*)( offsetamc13info) ,block_status);
          if ( retval < int(nAMC*sizeof(uint64_t)) ) 
            {                                                                                    
          cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,nAMC*sizeof(uint64_t));  
          return -1;                                                                  
            }
      
          // WARN if mismatch between ODB and AMC13 headers / trailers for AMC slot number
          for (iAMC = 0; iAMC < nAMC; iAMC++){
            if ( !amc13_rider_odb[amc_header_info[iAMC].AMCSlotNum-1].board.rider_enabled ) {
          //cm_msg(MERROR, __FILE__, "WARNING! AMC slot %i not enabled in ODB", amc_header_info[iAMC].AMCSlotNum);
          dbprintf("%s(%d): WARNING! amc_header_info[iAMC].AMCSlot %i\n", __func__, __LINE__, amc_header_info[iAMC].AMCSlotNum);
            }
          }
      
          // decode AMC header words - get continuation bits, event / block size, AMC slot number
          // set moredata = 1 if more blocks are following this block
          moredata = 0;
          for (iAMC = 0; iAMC < nAMC; iAMC++){
            if ( decodeAMCHeader( iAMC, be64toh( *( offsetamc13info ) ) ) != 0 )
          {
            printf("decodeAMCHeader() failed!\n");
          }
            offsetamc13info++;
            if (amc_header_info[iAMC].AMCMoreBit) moredata = 1;
      
           dbprintf("%s(%d): AMC index %d, AMC Slot number %d, AMCMoreBit %d, more data %d, AMCEventSize 0x%08x\n", 
                __func__, __LINE__, iAMC, amc_header_info[iAMC].AMCSlotNum, amc_header_info[iAMC].AMCMoreBit,  moredata, amc_header_info[iAMC].AMCEventSize );
          }
      
          // calculate AMC data offsets dataAMCoffset[amc_header_info[iAMC].AMCSlotNum-1] from total event sizes in S=0 word AMC header word  
          // (i.e. for either M=1,S=0 with continuation blocks or M=0,S=0 with only one block)
          // This calculation is performed once per fill / event and handles different total data sizes, 
          // i.e. amc_header_info[iAMC].AMCEventSize, from different AMC modules
          if ( !amc_header_info[0].AMCSegBit ) {
            int AMCoffsetbytes = 0;      
            for (iAMC = 0; iAMC < nAMC; iAMC++){
          dataAMCoffset[amc_header_info[iAMC].AMCSlotNum-1] = AMCoffsetbytes / sizeof(uint64_t);
          dbprintf("%s(%d): blockcount %d, AMC index %d, calculated AMC total data offset 0x%08x\n", 
               __func__, __LINE__, blockcount, iAMC, dataAMCoffset[amc_header_info[iAMC].AMCSlotNum-1]); 
          AMCoffsetbytes += sizeof(uint64_t)*amc_header_info[iAMC].AMCEventSize;
            }
          }
      
          // read AMC data block
          for (iAMC = 0; iAMC < nAMC; iAMC++){
      
            // calculate the data bytes - blockdatabytes - to read for each AMC module with index iAMC 
            // bits determine if first block, intermediate block, last block or single block
            if ( amc_header_info[iAMC].AMCMoreBit && (!amc_header_info[iAMC].AMCSegBit) )
          {
            blockdatabytes = 32768;
            dbprintf("M=1,S=0 first block in segment, set size to 0x%08x bytes (odb 0x%08x)\n", 
                 blockdatabytes, amc13_amc13_odb.amc_block_size);
          }
            if ( amc_header_info[iAMC].AMCMoreBit && amc_header_info[iAMC].AMCSegBit )
          {
            dbprintf("M=1,S=1 intermediate block in segment, set size from amc header word\n");
            blockdatabytes = sizeof(uint64_t)*amc_header_info[iAMC].AMCEventSize;
          }
            if ( (!amc_header_info[iAMC].AMCMoreBit) && amc_header_info[iAMC].AMCSegBit )
          {
            dbprintf("M=0,S=1 last block in segment, set size from amc header word\n");
            blockdatabytes = sizeof(uint64_t)*amc_header_info[iAMC].AMCEventSize;	  
          }
            if ( (!amc_header_info[iAMC].AMCMoreBit) && (!amc_header_info[iAMC].AMCSegBit) )
          {
            dbprintf("M=0,S=0 only block in segment, set size from amc header word\n");
            blockdatabytes = sizeof(uint64_t)*amc_header_info[iAMC].AMCEventSize;
          }
      
            // calculate the location to put the data from the block structure in the AMC13 event
            dataoffset = dataAMCoffset[amc_header_info[iAMC].AMCSlotNum-1] + datablockoffset[amc_header_info[iAMC].AMCSlotNum-1];
            dbprintf("%s(%d): blockcount %d, iAMC %d, calculated AMC+Block data offset 0x%08x block data bytes 0x%08x data bytes total 0x%08x\n", 
                 __func__, __LINE__, blockcount, iAMC, dataoffset, blockdatabytes, totaldatabytes); 
      
            // read the data block for each AMC module in array tcp_buf_gl[bufIndex]
            retval  = ReadXBytes( clientsockfd, blockdatabytes, (void*)( tcp_buf_gl[bufIndex] + dataoffset ) ,block_status);
          if ( retval < blockdatabytes) 
          {                                                                                    
            cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,blockdatabytes);  
            return -1;                                                                  
          }
            dbprintf("%s(%d): done reading AMC block %i bytes %i, dataoffset %d, (tcp_buf_gl[bufIndex] + dataoffset ) %p, data[0] 0x%16lx data[1] 0x%16lx\n", 
                 __func__, __LINE__, blockcount, blockdatabytes, dataoffset, ( tcp_buf_gl[bufIndex] + dataoffset ), 
                 *( tcp_buf_gl[bufIndex] + dataoffset ), *( tcp_buf_gl[bufIndex] + dataoffset + 1 ) ); 
      
            //dataoffset += blockdatabytes/sizeof(uint64_t); // redundant so removed?
            datablockoffset[amc_header_info[iAMC].AMCSlotNum-1] += blockdatabytes/sizeof(uint64_t); // datablockoffset[i] is individual payload readout from ith AMC module
            totaldatabytes += blockdatabytes; // totaldatabytes is total payload readout from all AMC modules
            dbprintf("%s(%d): end of read loop for amc %i\n",__func__, __LINE__,iAMC);
          }
      
          // read single 64-bit AMC13 block trailer word
          retval = ReadXBytes( clientsockfd, sizeof(uint64_t), (void*)( offsetamc13info ) ,block_status);
          if ( retval < int(sizeof(uint64_t))) 
            {                                                                                    
          cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd, retval, (int)sizeof(uint64_t));  
          return -1;                                                                  
            }
          dbprintf("%s(%d): done reading AMC block %i, trailer word *tmp 0x%08lx\n", 
               __func__, __LINE__, blockcount, *offsetamc13info); 
      
          offsetamc13info++;
          blockcount++;
        }
        dbprintf("%s(%d): finished data read / unpack, databytes total 0x%08x block count %i\n", 
             __func__, __LINE__, totaldatabytes, blockcount); 
      
        // get CDF trailer word
        retval = ReadXBytes( clientsockfd, tailbytes, (void*)(tail) ,block_status);
        if ( retval < int(tailbytes)) 
          {                                                                                    
            cm_msg(MERROR, __FILE__, "Error when reading from socket, fd %d. Read %d bytes vs %d", clientsockfd,retval,tailbytes);  
            return -1;                                                                  
          }
      
      #ifdef DEBUG
        OverallSize = getOverallSize( be64toh(tail[0]) );
      #endif
      
        dbprintf("%s(%d): read trailer, trailer size [bytes] %d, tail[0] 0x%016lX, EODdelimiter 0x%016lX, EODmask 0x%016lX, Overall Size %i\n", 
             __func__, __LINE__, tailbytes, be64toh(tail[0]), EODdelimiter, EODmask, OverallSize);
      
      #if 0 // turn on/off CPU-based byte-reordering in 8-byte AMC13 words 
      
        // re-order data from network / big-endian to little-endian
        struct timeval tbeforeReorderBytes, tafterReorderBytes;
        gettimeofday( &tbeforeReorderBytes, NULL);
      
        int iReorderBytes, nReorderBytes = totaldatabytes / sizeof(uint64_t);
        for (iReorderBytes = 0; iReorderBytes < nReorderBytes; iReorderBytes++){
          tcp_buf_gl[bufIndex][iReorderBytes] = be64toh( tcp_buf_gl[bufIndex][iReorderBytes] );
        }
      
        gettimeofday( &tafterReorderBytes, NULL);
        dbprintf("%s(%d): duration of byte re-ordering, buffer[%d], fill %d, duration %e us \n", 
           __func__, __LINE__, bufIndex, TCPfillnumber, toddiff( &tafterReorderBytes, &tbeforeReorderBytes) );
      #endif 
      
        return totaldatabytes;
      }
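The four M/S cases in the read loop above reduce to a single rule: the first block of a multi-block segment (M=1, S=0) uses the fixed block size, and every other case takes its size from the AMC header word. A minimal sketch of that rule, with illustrative names (`amcBlockBytes`, `kFirstBlockBytes` standing in for the ODB `amc_block_size` value of 32768) rather than the frontend's actual API:

```cpp
#include <cstdint>
#include <cstddef>

// Illustrative constant: the fixed first-segment size (amc_block_size in the ODB)
constexpr size_t kFirstBlockBytes = 32768;

// Returns the number of payload bytes to read for one AMC module, given its
// continuation bits (M = more, S = segment) and the event size from its header
// word, in 64-bit units.
size_t amcBlockBytes(bool moreBit, bool segBit, uint64_t eventSizeWords) {
    if (moreBit && !segBit)
        return kFirstBlockBytes;  // M=1,S=0: first block of a multi-block segment
    // M=1,S=1 (intermediate), M=0,S=1 (last), M=0,S=0 (only block):
    // size comes from the AMC header word
    return eventSizeWords * sizeof(uint64_t);
}
```

This also makes explicit that the chained `if` statements in the source are mutually exclusive, so `blockdatabytes` is always assigned exactly once per module.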
    2. Some checks are made in the main thread:
     gettimeofday( &tfinishread, NULL);
    
        //Test print of the fill number
        //printf("AMC13 Fill number = %d ; TCP Fill number = %d \n",getEventIndex( be64toh( header[0] ) ),int(TCPfillnumber));
    
    //Check whether the data was read out correctly
    if (databytes == 0)
    {
      //skip this iteration if no data is available
          pthread_mutex_unlock( &mutex_TCP_buf[bufIndex] );  
          continue; 
        }
    
        if (databytes < 0)
        {
      //terminate the while loop if there is a read error
      read_error = true;
      pthread_mutex_unlock( &mutex_TCP_buf[bufIndex] );  
      cm_msg(MERROR, __FILE__,"tcp_thread: break the tcp thread loop because of a reading error %d", databytes);
          break;
        }
    
        if ( toddiff( &tfinishread, &tbeginread) > 100000.) {
      printf("WARNING tcpip stall, readAndUnpack > 100ms!\n");
          printf("%s(%d): duration of readAndUnpack, read %d bytes, time = %e us \n", 
          __func__, __LINE__, databytes , toddiff( &tfinishread, &tbeginread) );
        }
    
        amc13infobytes = (uint64_t)offsetamc13info - (uint64_t)amc13info; 
        trigger_info.time_tcp_finish_header_read_s = header[3];
        trigger_info.time_tcp_finish_header_read_us = header[4];
    
        // get time done read / unpack of AMC13 event 
        status = gettimeofday( &tdata, NULL);
        header[5] = tdata.tv_sec; // fill data time info in header
        header[6] = tdata.tv_usec; // fill data time info in header
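The stall check above compares `toddiff( &tfinishread, &tbeginread)` against `100000.`, i.e. it works in microseconds. The frontend's actual `toddiff` implementation is not shown in this log; a sketch of what it is assumed to compute (the hypothetical name `toddiff_us` is mine):

```cpp
#include <sys/time.h>

// Assumed behavior of toddiff(&later, &earlier): the elapsed time between
// two timevals, in microseconds, handling the seconds/microseconds split.
double toddiff_us(const struct timeval *later, const struct timeval *earlier) {
    return (double)(later->tv_sec  - earlier->tv_sec) * 1e6
         + (double)(later->tv_usec - earlier->tv_usec);
}
```

With this convention, the `> 100000.` threshold in the stall warning corresponds to 100 ms, matching the message text.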
  3. GPU Proc Unlocked - got TCP Header Word 2
    1. Some checks are made to see if data is received and variables are initialized:
    //Check TCPfillnumber and make sure TCPfillnumber is greater
    unsigned long TCPfillnumber_local;
    unsigned long GPUfillnumber_local; //bor function can change the global fill number
    unsigned long Midasfillnumber_local;
    int local_thread_active = 0;
    int local_thread_read = 0;
    
    pthread_mutex_lock( &mutex_TCP_general );
    TCPfillnumber_local = TCPfillnumber;
    pthread_mutex_unlock( &mutex_TCP_general );
    
    pthread_mutex_lock( &mutex_GPU_general );
    GPUfillnumber_local = GPUfillnumber;
    local_thread_active = gpu_thread_active;
    local_thread_read = gpu_thread_read;
    pthread_mutex_unlock( &mutex_GPU_general );
    
    pthread_mutex_lock(&mutex_midas);
    Midasfillnumber_local = Midasfillnumber;
    pthread_mutex_unlock(&mutex_midas);
    
    if (!local_thread_active)
    {
      break;
    }
    
    if (!local_thread_read)
    {
      usleep(100);
      continue;
    }
     if (GPUfillnumber_local == TCPfillnumber_local || TCPfillnumber_local == 0)
        {
          dbprintf("%s(%d): No new events in the TCP buffer \n", __func__, __LINE__ );
          usleep(100);
          continue;
        }
    
        unsigned long tcp_buffer_filled = 0;
        if (TCPfillnumber_local > GPUfillnumber_local)
        {
          tcp_buffer_filled = TCPfillnumber_local - GPUfillnumber_local;
        }else{
          tcp_buffer_filled = 0xffffffffffffffff - (GPUfillnumber_local - TCPfillnumber_local) +1 ;
        }
        dbprintf("%s(%d): tcp_ring_buffer_size %lu \n", __func__, __LINE__, tcp_buffer_filled );
    
        dbprintf("%s(%d): tcp fill %lu gpu fill %lu \n", __func__, __LINE__, TCPfillnumber_local , GPUfillnumber_local );
    
        float BufLoad = tcp_buffer_filled * 1.0 / TCP_BUF_MAX_FILLS;
        float BufLoadThreshold = 0.9;
        if (BufLoad > BufLoadThreshold && !BufFullAlarmTriggered)
        { 
          BufFullAlarmTriggered = true;
          char AlarmMsg[500];
          sprintf(AlarmMsg,"DAQ | AMC13%03d TCP Ring buffer close to full (%f%%)",frontend_index,BufLoad*100);
    
          int ret_code = al_trigger_alarm("Frontend TCP Buffer Error", AlarmMsg, "Warning", "Frontend TCP Buffer Error", AT_INTERNAL); 
          if (ret_code != AL_SUCCESS) {
        cm_msg(MERROR, __FILE__, "Failure Raising Alarm: Error %d, Alarm \"%s\"", ret_code, "Frontend TCP Buffer Error"     );
          }
        }
        if (BufLoad < BufLoadThreshold && BufFullAlarmTriggered)
        { 
          BufFullAlarmTriggered = false;
          char AlarmMsg[500];
          sprintf(AlarmMsg,"DAQ | AMC13%03d TCP Ring buffer returns normal (%f%%)",frontend_index,BufLoad*100);
    
          int ret_code = al_trigger_alarm("Frontend TCP Buffer Recovery", AlarmMsg, "Recovery", "Frontend TCP Buffer Recovery", AT_INTERNAL); 
          if (ret_code != AL_SUCCESS) {
        cm_msg(MERROR, __FILE__, "Failure Raising Alarm: Error %d, Alarm \"%s\"", ret_code, "Frontend TCP Buffer Recovery"     );
          }
        }
    
        unsigned long gpu_buffer_filled = 0;
        if (GPUfillnumber_local > Midasfillnumber_local)
        {
          gpu_buffer_filled = GPUfillnumber_local - Midasfillnumber_local;
        }else{
          gpu_buffer_filled = 0xffffffffffffffff - (Midasfillnumber_local - GPUfillnumber_local) +1 ;
        }
        dbprintf("%s(%d): gpu_ring_buffer_size %lu \n", __func__, __LINE__, gpu_buffer_filled );
    
        dbprintf("%s(%d): gpu fill %lu midas fill %lu \n", __func__, __LINE__, GPUfillnumber_local , Midasfillnumber_local );
    
        float GPUBufLoad = gpu_buffer_filled * 1.0 / GPU_BUFFER_SIZE;
        float GPUBufLoadThreshold = 0.9;
        if (GPUBufLoad > GPUBufLoadThreshold && !GPUBufFullAlarmTriggered)
        { 
          GPUBufFullAlarmTriggered = true;
          char AlarmMsg[500];
          sprintf(AlarmMsg,"DAQ | AMC13%03d GPU Ring buffer close to full (%f%%)",frontend_index,GPUBufLoad*100);
    
          int ret_code = al_trigger_alarm("Frontend GPU Buffer Error", AlarmMsg, "Warning", "Frontend GPU Buffer Error", AT_INTERNAL); 
          if (ret_code != AL_SUCCESS) {
        cm_msg(MERROR, __FILE__, "Failure Raising Alarm: Error %d, Alarm \"%s\"", ret_code, "Frontend GPU Buffer Error"     );
          }
        }
        if (GPUBufLoad < GPUBufLoadThreshold && GPUBufFullAlarmTriggered)
        { 
          GPUBufFullAlarmTriggered = false;
          char AlarmMsg[500];
          sprintf(AlarmMsg,"DAQ | AMC13%03d GPU Ring buffer returns normal (%f%%)",frontend_index,GPUBufLoad*100);
    
          int ret_code = al_trigger_alarm("Frontend GPU Buffer Recovery", AlarmMsg, "Recovery", "Frontend GPU Buffer Recovery", AT_INTERNAL); 
          if (ret_code != AL_SUCCESS) {
        cm_msg(MERROR, __FILE__, "Failure Raising Alarm: Error %d, Alarm \"%s\"", ret_code, "Frontend GPU Buffer Recovery"     );
          }
        }
    
        //Do not proceed if the GPU buffer is full
        if ( (gpu_buffer_filled >= GPU_BUFFER_SIZE - 1) || (tcp_buffer_filled >= TCP_BUF_MAX_FILLS - 1) )
        {
          fc7help->setThrottleTriggers( encoder_fc7, frontend_index, 1);
          triggersThrottled = true;
          cm_msg(MINFO, __FILE__, "Requesting Encoder FC7 to throttle TTC triggers to clear TCP/GPU ring buffers");
          continue;
        } else if ( triggersThrottled ) {
          fc7help->setThrottleTriggers( encoder_fc7, frontend_index, 0);
          triggersThrottled = false;
          cm_msg(MINFO, __FILE__, "Trigger throttling removed");
        }
    
        // calculate TCP ring buffer index from GPU fill number
        TCPbufferindex = GPUfillnumber_local%TCP_BUF_MAX_FILLS;
        dbprintf("%s(%d): start new fill %d, buffer %d\n", __func__, __LINE__, GPUfillnumber_local, TCPbufferindex );
    
        // calculate the GPU ring buffer index 
        GPUbufferindex = GPUfillnumber_local % GPU_BUFFER_SIZE;
    
        //Lock GPU buffer unit
        pthread_mutex_lock( &mutex_GPU_buf[GPUbufferindex] );
        dbprintf("%s(%d): got lock to write to GPU buffers %d, \n", 
        __func__, __LINE__, GPUbufferindex  );
    
        // get start time for GPU thread processing
        gettimeofday( &tstart, NULL);
        trigger_info.time_gputhread_started_s = tstart.tv_sec; 
        trigger_info.time_gputhread_started_us = tstart.tv_usec; 
        //These have to be done after the memory copy
        //TODO: Check DATA 
        //GPU_Data_Buffer[GPUbufferindex].gpu_data_header[7] = tstart.tv_sec; 
        //GPU_Data_Buffer[GPUbufferindex].gpu_data_header[8] = tstart.tv_usec; 
    
        // use lock to access the tcp_thread buffers - tcp_buf_gl[i], tcp_buf_header_gl[i], tcp_buf_tail_gl[i]
        pthread_mutex_lock( &mutex_TCP_buf[TCPbufferindex] );
        dbprintf("%s(%d): got lock to read from TCP output buffers, *tcp_buf_header_gl[%d] = 0x%08x\n", 
        __func__, __LINE__, TCPbufferindex, be32toh ( *tcp_buf_header_gl[TCPbufferindex] )  );
    
        // get AMC13 event index from data header ( ugly fix for 64-bit AMC words )
    #ifdef DEBUG
        AMC13fillcounter = ( be32toh ( *tcp_buf_header_gl[TCPbufferindex] ) & 0x00FFFFFF ); 
    #endif
    
    #ifdef USE_GPU 
    #ifdef TIME_MEASURE_DEF 
        cudaEvent_t start, stop;
        float elapsedTime;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);
        cudaEventRecord(start, 0);
    #endif // USE_GPU
    #endif // TIME_MEASURE_DEF
    
        dbprintf("%s(%d): got lock to write to GPU output buffers, fill %d\n", __func__, __LINE__, GPUfillnumber_local);
    
        // set GPU_thread data sizes from TCP_thread data sizes and ODB parameters 
        GPU_Data_Buffer[GPUbufferindex].gpu_data_header_amc13_size = TCPtotalamc13infosize[TCPbufferindex]; // AMC13 headers / trailers
        GPU_Data_Buffer[GPUbufferindex].gpu_data_header_size = TCPtotalheadersize[TCPbufferindex]; // timing / performance data
        GPU_Data_Buffer[GPUbufferindex].gpu_data_tail_size = TCPtotaltailsize[TCPbufferindex]; // CDF 64-bit trailer word
        GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size = TCPtotaldatasize[TCPbufferindex]; // raw, unpacked AMC payload
    
        // copy header, trailer amc13info for every fill
        memcpy( GPU_Data_Buffer[GPUbufferindex].gpu_data_header, tcp_buf_header_gl[TCPbufferindex], GPU_Data_Buffer[GPUbufferindex].gpu_data_header_size );
        //Add the GPU processing start time stamp
        GPU_Data_Buffer[GPUbufferindex].gpu_data_header[7] = tstart.tv_sec; 
        GPU_Data_Buffer[GPUbufferindex].gpu_data_header[8] = tstart.tv_usec; 
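The occupancy bookkeeping above handles the case where a free-running 64-bit fill counter wraps: the explicit `0xffffffffffffffff - (a - b) + 1` branch computes the producer-consumer distance modulo 2^64, which plain unsigned subtraction already does. A sketch of that, plus the one-shot alarm hysteresis pattern used for `BufFullAlarmTriggered` / `GPUBufFullAlarmTriggered` (function names here are illustrative, not the frontend's):

```cpp
#include <cstdint>

// Number of fills sitting in the ring buffer between producer and consumer.
// Unsigned subtraction wraps modulo 2^64, which is exactly what the explicit
// two-branch calculation in the excerpt computes by hand.
uint64_t bufferFilled(uint64_t producerFill, uint64_t consumerFill) {
    return producerFill - consumerFill;
}

// One-shot alarm with hysteresis, mirroring the BufFullAlarmTriggered logic:
// raise once when load crosses the threshold, clear once when it drops back.
// Returns +1 to raise the alarm, -1 to clear it, 0 for no action.
int checkBufferAlarm(double load, double threshold, bool &alarmTriggered) {
    if (load > threshold && !alarmTriggered) { alarmTriggered = true;  return +1; }
    if (load < threshold &&  alarmTriggered) { alarmTriggered = false; return -1; }
    return 0;
}
```

The `alarmTriggered` flag is what prevents `al_trigger_alarm` from being re-fired on every loop iteration while the buffer stays above (or below) the threshold.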

4. GPU Copy Done - GPU Proc Unlocked
1. Some CUDA memcopies and such:

    dbprintf("%s(%d): copied header databank  [size=0x%08x], header[0] 0x%08x, readout fill number %d, GPU fill number %d\n", 
    __func__, __LINE__, GPU_Data_Buffer[GPUbufferindex].gpu_data_header_size, be32toh(GPU_Data_Buffer[GPUbufferindex].gpu_data_header[0]), AMC13fillcounter, GPUfillnumber_local );
    memcpy( GPU_Data_Buffer[GPUbufferindex].gpu_data_tail, tcp_buf_tail_gl[TCPbufferindex], TCPtotaltailsize[TCPbufferindex] );
    dbprintf("%s(%d): copied tail databank  [size=0x%08x], tail[0] 0x%08x, readout fill number %d, GPU fill number %d\n", 
    __func__, __LINE__, GPU_Data_Buffer[GPUbufferindex].gpu_data_tail_size, be32toh(GPU_Data_Buffer[GPUbufferindex].gpu_data_tail[0]), AMC13fillcounter, GPUfillnumber_local );
    memcpy( GPU_Data_Buffer[GPUbufferindex].gpu_data_header_amc13, tcp_buf_amc13_gl[TCPbufferindex], TCPtotalamc13infosize[TCPbufferindex] );
    dbprintf("%s(%d): copied amc13 databank  [size=0x%08x], amc13[0] 0x%08x, readout fill number %d, GPU fill number %d\n", 
    __func__, __LINE__, GPU_Data_Buffer[GPUbufferindex].gpu_data_header_amc13_size, be32toh(GPU_Data_Buffer[GPUbufferindex].gpu_data_header_amc13[0]), AMC13fillcounter, GPUfillnumber_local );

    //  extract / copy rider header / trailer data from raw payload to rider header / trailer array  (call arguments mirror memcpy)
    gettimeofday( &tbeforeextract, NULL);

    GPU_Data_Buffer[GPUbufferindex].gpu_data_header_rider_size = extractRiderHeader( GPU_Data_Buffer[GPUbufferindex].gpu_data_header_rider, tcp_buf_gl[TCPbufferindex], GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size ); 
    dbprintf("%s(%d): copied rider databank[%d], rider[first] 0x%16lx, rider[last] 0x%16lx, readout fill number %d, GPU fill number %d\n",
    __func__, __LINE__, GPU_Data_Buffer[GPUbufferindex].gpu_data_header_rider_size, *(GPU_Data_Buffer[GPUbufferindex].gpu_data_header_rider),
    *(GPU_Data_Buffer[GPUbufferindex].gpu_data_header_rider+(GPU_Data_Buffer[GPUbufferindex].gpu_data_header_rider_size/sizeof(uint64_t))-1), AMC13fillcounter, GPUfillnumber_local );

    gettimeofday( &tafterextract, NULL);
    dbprintf("%s(%d): duration of extract and copy of rider headers, fdt = %e us \n", 
    __func__, __LINE__, toddiff( &tafterextract, &tbeforeextract) );

    // extract the FillType etc from rider header / trailers words 
    int indexModHeaderWord2 = 1; // using module header word
    u_int64_t ModHeader2 = be64toh ( GPU_Data_Buffer[GPUbufferindex].gpu_data_header_rider[indexModHeaderWord2] );
    u_int64_t ModUserBitMask = ModUserBitField << ModUserBitOffset;
    int UserField = ( ( ModHeader2 & ModUserBitMask ) >> ModUserBitOffset  ); // from Rider User Manual, June 17 2015
    int ModFillType = UserField & 0x7;  
    dbprintf("%s(%d): 64-bit Mod header word 0x%016lx after be64toh 0x%016lx and ModFillType 0x%04x\n",
    __func__, __LINE__, GPU_Data_Buffer[GPUbufferindex].gpu_data_header_rider[indexModHeaderWord2], ModHeader2, ModFillType);

    /*
    // 8/14/2017, TG, skip the identification of the fill length from the channel headers. This won't work
    // for async WFD5s with muon/laser fills and sync WFD5s with async fills.. The calculated variables 
    // ChanFillType and WfrmFillType were only used to verify the fill type extracted from the module header

    int indexChanHeaderWord2 = 3; // using channel header word
    u_int64_t ChanHeader2 = be64toh ( gpu_data_header_rider[indexChanHeaderWord2] );
    u_int64_t ChanFTBitMask = ChanFTBitField << ChanFTBitOffset;
    int ChanFillType = ( ( ChanHeader2 & ChanFTBitMask ) >> ChanFTBitOffset  ); // from Rider User Manual, June 17 2015
    dbprintf("%s(%d): 64-bit Chan header word 0x%016lx after be64toh 0x%016lx and chan fill type 0x%04x\n",
    __func__, __LINE__, gpu_data_header_rider[indexChanHeaderWord2], ChanHeader2, ChanFillType);

    int indexWfrmHeaderWord1 = 4; // using waveform header word
    u_int64_t WfrmHeader1 = be64toh ( gpu_data_header_rider[indexWfrmHeaderWord1] );
    u_int64_t WfrmFTBitMask = WfrmFTBitField << WfrmFTBitOffset;
    int WfrmFillType = ( ( WfrmHeader1 & WfrmFTBitMask ) >> WfrmFTBitOffset  ); // from Rider User Manual, June 17 2015
    dbprintf("%s(%d): 64-bit Wfrm header word 0x%016lx after be64toh 0x%016lx and wfrm fill type 0x%04x\n",
    __func__, __LINE__, gpu_data_header_rider[indexWfrmHeaderWord1], WfrmHeader1, WfrmFillType);
     */

    bool process_laser = false;
    for(int ii=0;ii<4;ii++){
      if(tq_parameters_odb[ii].fill_type==2) process_laser=true;
    }
    // copy raw data for pre-scaled muon fills, and always for laser/pedestal type fills 
    //if ( ModFillType>1 || ( amc13_settings_odb.store_raw && !((AMC13fillcounter-1)%amc13_settings_odb.prescale_raw) ) )

    //printf("ModFillType = %i, amc13_settings_odb.store_raw = %i, GPUmuonfillnumber = %i\n",ModFillType, amc13_settings_odb.store_raw, GPUmuonfillnumber);
    //printf("store_raw = %i, GPUmuonfillnumber = %i, amc13_settings_odb.prescale_raw = %i, check = %i\n",amc13_settings_odb.store_raw,GPUmuonfillnumber,amc13_settings_odb.prescale_raw,!GPUmuonfillnumber%amc13_settings_odb.prescale_raw ); 
    if ( frontend_index==local_encoder_crate || ModFillType>2 || (ModFillType==2 && !process_laser) || ( amc13_settings_odb.store_raw && !(GPUmuonfillnumber%amc13_settings_odb.prescale_raw) ) )
    {
      memcpy( GPU_Data_Buffer[GPUbufferindex].gpu_data_raw, tcp_buf_gl[TCPbufferindex], GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size );

      dbprintf("%s(%d): copied raw databank  [size=0x%08x], raw[0] 0x%04x, raw[1] 0x%04x, raw[2] 0x%04x, raw[3] 0x%04x, readout fill number %d, GPU fill number %d, GPU muon fill number %d\n", 
      __func__, __LINE__, GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size, *GPU_Data_Buffer[GPUbufferindex].gpu_data_raw, *(GPU_Data_Buffer[GPUbufferindex].gpu_data_raw+1), *(GPU_Data_Buffer[GPUbufferindex].gpu_data_raw+2), *(GPU_Data_Buffer[GPUbufferindex].gpu_data_raw+3), AMC13fillcounter, GPUfillnumber_local, GPUmuonfillnumber );
    }

#ifdef USE_GPU  

    // for a muon fill (or a laser fill when laser processing is enabled), with any TQ processing switched on, copy data to GPU 
    if ( (ModFillType==1 || (ModFillType==2 && process_laser)) && Any_processing_on ) {

      if ( GPU_IBUF_SIZE < GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size )
      {
    printf("%s(%d): fill is too large (%d bytes) for GPU buffer (%d bytes) \n", 
        __func__, __LINE__, GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size, GPU_IBUF_SIZE );
    GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size = 1;
      }      

      dbprintf("%s(%d): *** GPU input data[0], data[0]: %li %li total size %d\n", 
      __func__, __LINE__, *(tcp_buf_gl[TCPbufferindex]), *(tcp_buf_gl[TCPbufferindex]), GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size);

      // copy raw AMC payload data to GPU
      cudaCopyStatus = cudaMemcpy( gpu_idata, tcp_buf_gl[TCPbufferindex], GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size,  cudaMemcpyHostToDevice);
      if ( cudaCopyStatus != cudaSuccess )
      {
    printf("cudaMemcpy of input data FAIL, status: %d error: %s bytes: %d\n", cudaCopyStatus, cudaGetErrorString(cudaCopyStatus), GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size);
    if ( cudaCopyStatus == cudaErrorInvalidValue  ) printf("cudaErrorInvalidValue !\n");
    if ( cudaCopyStatus == cudaErrorInvalidDevicePointer ) printf("cudaErrorInvalidDevicePointer!\n");
      }

#ifdef TIME_MEASURE_DEF
      cudaEventRecord(stop, 0);
      cudaEventSynchronize(stop);
      cudaEventElapsedTime(&elapsedTime, start, stop);
      dbprintf("%s(%d): copied data from CPU (pntr %p) to GPU (pntr %p), size %d, time %f ms\n",
      __func__, __LINE__, tcp_buf_gl[TCPbufferindex], gpu_idata, GPU_Data_Buffer[GPUbufferindex].gpu_data_raw_size, elapsedTime);
      cudaEventDestroy(start);
      cudaEventDestroy(stop);
#endif // TIME_MEASURE_DEF	
    } // end cuda copy from host to device (if Any_processing_on is true)

    // get GPU copy time for GPU thread
    gettimeofday( &tcopy, NULL);
    dbprintf("%s(%d): duration of start to copy, fdt = %e us \n", __func__, __LINE__, toddiff( &tstart, &tcopy) );
    trigger_info.time_gputhread_copytogpu_done_s = tcopy.tv_sec;
    trigger_info.time_gputhread_copytogpu_done_us = tcopy.tv_usec;  
  1. GPU Proc Done - GPU Copy Done
    1. Cuda memcopy
    #endif // USE_GPU 
    
        // unlock access to the TCP buffer now that all data has been copied to the GPU buffers
        pthread_mutex_unlock( &mutex_TCP_buf[TCPbufferindex]);
        dbprintf("%s(%d): unlocking ring buffer , buffer %d, fill %d\n",  __func__, __LINE__, TCPbufferindex, GPUfillnumber_local);
    
    #ifdef USE_GPU  
    
        // for a muon or laser type fill with TQ processing switched on, launch processing on the GPU 
        if ( ModFillType==1 || ModFillType==2) {
    
          for (int itq = 0; itq < TQMETHOD_MAX; itq++){
    
        if ( tq_parameters_odb[itq].TQ_on || tq_parameters_odb[itq].store_hist ) {
          if(tq_parameters_odb[itq].fill_type != ModFillType) continue;
    
          cuda_g2_run_kernel( gpu_idata, gpu_odata, GPU_Data_Buffer[GPUbufferindex].gpu_data_proc[itq], itq , GPUbufferindex); // see kernel.cu for gpu processing functions
    
          // note that copy from device to host of processed data gpu_data_proc and setting of data size gpu_data_proc_size is done 
          // in  function cuda_g2_run_kernel() whereas the copying and zeroing of histogram data on pre-scaled fills is done here.
          //if (  tq_parameters_odb[itq].store_hist && !((AMC13fillcounter-1)%tq_parameters_odb[itq].flush_hist) ) 
          if (  tq_parameters_odb[itq].store_hist && ((GPUmuonfillnumber+1)%tq_parameters_odb[itq].flush_hist)==0 ) {
    
            // copy histogram data
            cudaCopyStatus = cudaMemcpy( GPU_Data_Buffer[GPUbufferindex].gpu_data_his[itq], gpu_odata+GPU_Data_Buffer[GPUbufferindex].gpu_data_his_offset[itq], GPU_Data_Buffer[GPUbufferindex].gpu_data_his_size[itq], cudaMemcpyDeviceToHost); 
            if (cudaCopyStatus !=  cudaSuccess )
            {
              printf("cudaMemcpy of output data FAIL, status: %d error: %s bytes: %d\n",
              cudaCopyStatus, cudaGetErrorString(cudaCopyStatus), GPU_Data_Buffer[GPUbufferindex].gpu_data_his_size[itq]);
              if ( cudaCopyStatus == cudaErrorInvalidValue  ) printf("cudaErrorInvalidValue !\n");
              if ( cudaCopyStatus == cudaErrorInvalidDevicePointer ) printf("cudaErrorInvalidDevicePointer!\n");
            }
    
            dbprintf("%s(%d): TQ=%i, gpu_odata %p, copying / zeroing hist databank [ size=%d, offset=%d], hist[0] 0x%08x, hist[N/8] 0x%08x, hist[N/4] 0x%08x, readout fill number %d, GPU fill number %d, GPU muon fill number %d\n",
            __func__, __LINE__, itq, (gpu_odata+GPU_Data_Buffer[GPUbufferindex].gpu_data_his_offset[itq]), 
            GPU_Data_Buffer[GPUbufferindex].gpu_data_his_size[itq], GPU_Data_Buffer[GPUbufferindex].gpu_data_his_offset[itq], *(GPU_Data_Buffer[GPUbufferindex].gpu_data_his[itq]), *(GPU_Data_Buffer[GPUbufferindex].gpu_data_his[itq]+GPU_Data_Buffer[GPUbufferindex].gpu_data_his_size[itq]/8+1), *(GPU_Data_Buffer[GPUbufferindex].gpu_data_his[itq]+GPU_Data_Buffer[GPUbufferindex].gpu_data_his_size[itq]/4+1), AMC13fillcounter, GPUfillnumber_local, GPUmuonfillnumber );
    
            // zero histogram data
            cudaCopyStatus = cudaMemset( gpu_odata+GPU_Data_Buffer[GPUbufferindex].gpu_data_his_offset[itq], 0, GPU_Data_Buffer[GPUbufferindex].gpu_data_his_size[itq]); // size units are bytes
            if (cudaCopyStatus !=  cudaSuccess )
            {
              printf("cudaMemset of histo data FAIL, status: %d error: %s bytes: %d\n",
              cudaCopyStatus, cudaGetErrorString(cudaCopyStatus), GPU_Data_Buffer[GPUbufferindex].gpu_data_his_size[itq]);
              if ( cudaCopyStatus == cudaErrorInvalidValue  ) printf("cudaErrorInvalidValue !\n");
              if ( cudaCopyStatus == cudaErrorInvalidDevicePointer ) printf("cudaErrorInvalidDevicePointer!\n");
            }
    
    
          } // end flush and zero of histogram data
    
        } // if TQ processing or histogram processing is switched on
    
          } // loop over index itq of TQ methods  
    
        } // if muon fill
    #endif // USE_GPU 
    
  2. MFE proc unlocked - GPU Proc done
    1. Near beginning of read trigger event
    INT read_trigger_event(char *pevent, INT off __attribute__((unused)))
    {
    
      int status __attribute__((unused));
      float *fdata;
      BYTE *bdata;
      short *pdata;
      DWORD *hdata;
      char bk_name[8];
      int frontend_index = get_frontend_index();
    
      // temporary array for performance data, to allow unlocking the gpu thread before data compression
      int perf_data_size = 0;
      uint64_t *perf_data;
      perf_data = (uint64_t*) malloc( gpu_data_header_size_max );
    
      dbprintf("Begin read_trigger_event!\n");
    
      //Obtain the address of the data struct in the GPU buffer
      int GPUbufferindex = Midasfillnumber % GPU_BUFFER_SIZE;
      GPU_Data_t* GPUDATA = &(GPU_Data_Buffer[GPUbufferindex]);
    
      //Lock the buffer access
      pthread_mutex_lock( &mutex_GPU_buf[GPUbufferindex] );
      // get AMC13 fill number
      unsigned int AMC13fillcounter = ( be32toh ( GPUDATA->gpu_data_header[0] ) & 0x00FFFFFF ); 
    
      // get GPU muon fill number that's stored by gpu_thread (used for flushing the CQ, CR banks)
      unsigned int GPUmuonfillcounter = GPUDATA->gpu_data_header[21];
      dbprintf("GPUmuonfillcounter %i\n", GPUmuonfillcounter);
    
      // get data ready time
      struct timeval t_lock_data, t_got_data;
    
      status = gettimeofday( &t_lock_data, NULL);
      trigger_info.time_slave_lock_dataready_s  = t_lock_data.tv_sec;
      trigger_info.time_slave_lock_dataready_us = t_lock_data.tv_usec;
  3. MFE Banks Made - MFE Proc Unlocked
    1. Bunch of bank construction (too many lines)
  4. lossless compression - MFE Banks Made
    1. lossless data compression, then deletes other banks if compression is on
      //This is for run3 and before
      /*
       GPUDATA->gpu_data_header[19] = TCPfillnumber;
       GPUDATA->gpu_data_header[20] = GPUfillnumber;
       GPUDATA->gpu_data_header[21] = GPUmuonfillcounter; // the muon fill counter as set for fill in gpu_thread
       */
      //In Run 4 nothing has to be done here
      //TODO Check Data!
    
      // fix size of header / timing data
      perf_data_size = 22*sizeof(GPUDATA->gpu_data_header[0]);
      // perf_data, perf_data_size are copies of GPUDATA->gpu_data_header, GPUDATA->gpu_data_header_size in order to release gpu lock before data compression
      memcpy( perf_data, GPUDATA->gpu_data_header, perf_data_size);
    
    
    
      // unlock gpu thread access to GPU output buffer
      pthread_mutex_unlock( &mutex_GPU_buf[GPUbufferindex] );
    
      // for riders, make the losslessly-compressed processed databank 
    
    
      dbprintf("%s(%d): lossless data compression %i\n", __func__, __LINE__, amc13_settings_odb.lossless_compression);
      if ( amc13_settings_odb.lossless_compression ){
        BANK_HEADER *bank_header = (BANK_HEADER *) pevent;
        dbprintf("%s(%d): fill FZ data bank, data size %lu\n",__func__, __LINE__, bank_header->data_size+sizeof(BANK_HEADER));
        if ( fe_compress_z(pevent, // char pointer to location of output
             (char*)bank_header, // char pointer to location of input
             bank_header->data_size+sizeof(BANK_HEADER), // data size + header size
             max_event_size-(bank_header->data_size+sizeof(BANK_HEADER)+sizeof(EVENT_HEADER)), // available space
        0) != FE_SUCCESS ){
          // compression failed. store raw data
          printf("%s(%d): fill FZ data bank - compression failed\n",__func__, __LINE__);
        }
    
        // if losslessly compressing the midas banks then delete the uncompressed banks
    
    #ifdef USE_GPU	
        for (int itq = 0; itq < TQMETHOD_MAX; itq++){
    
          if ( tq_parameters_odb[itq].TQ_on && Fill_type==1 ) {
    
            sprintf(bk_name,"%sS%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
            sprintf(bk_name,"%sP%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
            sprintf(bk_name,"%sT%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
          } // end delete CQ, CP, CT banks
    
          //if ( tq_parameters_odb[itq].store_hist && Fill_type==1 && !( (AMC13fillcounter-1-tq_parameters_odb[itq].flush_offset_hist) % tq_parameters_odb[itq].flush_hist ) )
          // flush offset is disabled for run4, and making sure that fill0 is not flushed.
          if ( tq_parameters_odb[itq].store_hist && Fill_type==1 && ( (GPUmuonfillcounter+1) % tq_parameters_odb[itq].flush_hist ) == 0) {
    
            sprintf(bk_name,"%sQ%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
          } // end delete CH bank
    
          if ( tq_parameters_odb[itq].fit_islands>0 && Fill_type==1 ){
    
            sprintf(bk_name,"%sF%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
          } // end delete CF bank
    
        } // end loop over TQ methods
        sprintf(bk_name,"CA000");
        bk_delete(pevent,bk_name);
        dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
        sprintf(bk_name,"CR000");
        bk_delete(pevent,bk_name);
        dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
        sprintf(bk_name,"CZ000");
        bk_delete(pevent,bk_name);
        dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
    #endif // USE_GPU
    
    
        if (Fill_type>1 || ( amc13_settings_odb.store_raw && !( ( GPUmuonfillcounter - amc13_settings_odb.prescale_offset_raw ) % amc13_settings_odb.prescale_raw ) && AMC13fillcounter>=amc13_settings_odb.prescale_offset_raw)) {
    
          if (Fill_type == 0x1) {
            sprintf(bk_name,"CR%03i",frontend_index); // muon fill type
          }
          else if(Fill_type == 0x2) {
            sprintf(bk_name,"LR%03i",frontend_index); // laser fill type
          }
          else if(Fill_type == 0x3) {
            sprintf(bk_name,"PR%03i",frontend_index); // pedestal fill type
          }
          else if(Fill_type == 0x4) {
            sprintf(bk_name,"AR%03i",frontend_index); // async fill type
          }
          bk_delete(pevent,bk_name);
          dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
        } // end delete CR bank, etc
    
        // delete CA / LA / PA banks
        if (Fill_type == 0x1) {
          sprintf(bk_name,"CA%03i",frontend_index); // muon fill type
        }
        else if(Fill_type == 0x2) {
          sprintf(bk_name,"LA%03i",frontend_index); // laser fill type
        }
        else if(Fill_type == 0x3) {
          sprintf(bk_name,"PA%03i",frontend_index); // pedestal fill type
        }
        else if(Fill_type == 0x4) {
          sprintf(bk_name,"AA%03i",frontend_index); // async fill type
        }
        bk_delete(pevent,bk_name);
        dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
        // delete CB bank (there's no filling of a "CB" bank equivalent for laser, ped, async fills)
        if (Fill_type == 0x1) {
          sprintf(bk_name,"CB%03i",frontend_index); // muon fill type
          bk_delete(pevent,bk_name);
          dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
        }
    
          // delete CC / LC / PC / AC banks
          if (Fill_type == 0x1) {
        sprintf(bk_name,"CC%03i",frontend_index); // muon fill type
          }
          else if(Fill_type == 0x2) {
        sprintf(bk_name,"LC%03i",frontend_index); // laser fill type
          }
          else if(Fill_type == 0x3) {
        sprintf(bk_name,"PC%03i",frontend_index); // pedestal fill type
          }
          else if(Fill_type == 0x4) {
        sprintf(bk_name,"AC%03i",frontend_index); // async fill type
          }
          bk_delete(pevent,bk_name);
          dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
          // delete CZ / LZ / PZ / AZ banks
          if (Fill_type == 0x1) {
        sprintf(bk_name,"CZ%03i",frontend_index); // muon fill type
          }
          else if(Fill_type == 0x2) {
        sprintf(bk_name,"LZ%03i",frontend_index); // laser fill type
          }
          else if(Fill_type == 0x3) {
        sprintf(bk_name,"PZ%03i",frontend_index); // pedestal fill type
          }
          else if(Fill_type == 0x4) {
        sprintf(bk_name,"AZ%03i",frontend_index); // async fill type
          }
          bk_delete(pevent,bk_name);
          dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
    
      } // end lossless compression
    
      //This is for run3 and before
      /*
       GPUDATA->gpu_data_header[19] = TCPfillnumber;
       GPUDATA->gpu_data_header[20] = GPUfillnumber;
       GPUDATA->gpu_data_header[21] = GPUmuonfillcounter; // the muon fill counter as set for fill in gpu_thread
       */
      //In Run 4 nothing has to be done here
      //TODO Check Data!
    
      // fix size of header / timing data
      perf_data_size = 22*sizeof(GPUDATA->gpu_data_header[0]);
      // perf_data, perf_data_size are copies of GPUDATA->gpu_data_header, GPUDATA->gpu_data_header_size in order to release gpu lock before data compression
      memcpy( perf_data, GPUDATA->gpu_data_header, perf_data_size);
    
    
    
      // unlocking gpu thread access to GPU output buffer (commented out because causing problems)
      pthread_mutex_unlock( &mutex_GPU_buf[GPUbufferindex] );
    
      // for rider's make losslessly-compressed processed databank 
    
    
      dbprintf("%s(%d): lossless data compression %i\n", __func__, __LINE__, amc13_settings_odb.lossless_compression);
      if ( amc13_settings_odb.lossless_compression ){
        BANK_HEADER *bank_header = (BANK_HEADER *) pevent;
        dbprintf("%s(%d): fill FZ data bank, data size %lu\n",__func__, __LINE__, bank_header->data_size+sizeof(BANK_HEADER));
        if ( fe_compress_z(pevent, // char pointer to location of output
             (char*)bank_header, // char pointer to location of input
             bank_header->data_size+sizeof(BANK_HEADER), // data size + header size
             max_event_size-(bank_header->data_size+sizeof(BANK_HEADER)+sizeof(EVENT_HEADER)), // available space
        0) != FE_SUCCESS ){
      // compression failed. store raw data
          printf("%s(%d): fill FZ data bank - compression failed\n",__func__, __LINE__);
        }
    
        // if losslessly compressing the midas banks then delete the uncompressed banks
    
    #ifdef USE_GPU	
        for (int itq = 0; itq < TQMETHOD_MAX; itq++){
    
          if ( tq_parameters_odb[itq].TQ_on && Fill_type==1 ) {
    
            sprintf(bk_name,"%sS%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
            sprintf(bk_name,"%sP%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
            sprintf(bk_name,"%sT%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
          } // end delete CQ, CP, CT banks
    
          //if ( tq_parameters_odb[itq].store_hist && Fill_type==1 && !( (AMC13fillcounter-1-tq_parameters_odb[itq].flush_offset_hist) % tq_parameters_odb[itq].flush_hist ) )
          // flush offset is disabled for run4, and making sure that fill0 is not flushed.
          if ( tq_parameters_odb[itq].store_hist && Fill_type==1 && ( (GPUmuonfillcounter+1) % tq_parameters_odb[itq].flush_hist ) == 0) {
    
            sprintf(bk_name,"%sQ%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
          } // end delete CH bank
    
          if ( tq_parameters_odb[itq].fit_islands>0 && Fill_type==1 ){
    
            sprintf(bk_name,"%sF%03i", tq_parameters_odb[itq].TQ_bankprefix, frontend_index);
            bk_delete(pevent,bk_name);
            dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
          } // end delete CF bank
    
        } // end loop over TQ methods
        sprintf(bk_name,"CA000");
        bk_delete(pevent,bk_name);
        dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
        sprintf(bk_name,"CR000");
        bk_delete(pevent,bk_name);
        dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
        sprintf(bk_name,"CZ000");
        bk_delete(pevent,bk_name);
        dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
    #endif // USE_GPU
    
    
        if (Fill_type>1 || ( amc13_settings_odb.store_raw && !( ( GPUmuonfillcounter - amc13_settings_odb.prescale_offset_raw ) % amc13_settings_odb.prescale_raw ) && AMC13fillcounter>=amc13_settings_odb.prescale_offset_raw)) {
    
          if (Fill_type == 0x1) {
            sprintf(bk_name,"CR%03i",frontend_index); // muon fill type
          }
          else if(Fill_type == 0x2) {
            sprintf(bk_name,"LR%03i",frontend_index); // laser fill type
          }
          else if(Fill_type == 0x3) {
            sprintf(bk_name,"PR%03i",frontend_index); // pedestal fill type
          }
          else if(Fill_type == 0x4) {
            sprintf(bk_name,"AR%03i",frontend_index); // async fill type
          }
          bk_delete(pevent,bk_name);
          dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
        } // end delete CR bank, etc
    
        // delete CA / LA / PA banks
        if (Fill_type == 0x1) {
          sprintf(bk_name,"CA%03i",frontend_index); // muon fill type
        }
        else if(Fill_type == 0x2) {
          sprintf(bk_name,"LA%03i",frontend_index); // laser fill type
        }
        else if(Fill_type == 0x3) {
          sprintf(bk_name,"PA%03i",frontend_index); // pedestal fill type
        }
        else if(Fill_type == 0x4) {
          sprintf(bk_name,"AA%03i",frontend_index); // async fill type
        }
        bk_delete(pevent,bk_name);
        dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
        // delete CB bank (there's no filling of "CB" bank equivalent for laser, ped, async fills
        if (Fill_type == 0x1) {
          sprintf(bk_name,"CB%03i",frontend_index); // muon fill type
          bk_delete(pevent,bk_name);
          dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
        }
    
    // delete CC / LC / PC / AC banks
    if (Fill_type == 0x1) {
      sprintf(bk_name,"CC%03i",frontend_index); // muon fill type
    }
    else if(Fill_type == 0x2) {
      sprintf(bk_name,"LC%03i",frontend_index); // laser fill type
    }
    else if(Fill_type == 0x3) {
      sprintf(bk_name,"PC%03i",frontend_index); // pedestal fill type
    }
    else if(Fill_type == 0x4) {
      sprintf(bk_name,"AC%03i",frontend_index); // async fill type
    }
    bk_delete(pevent,bk_name);
    dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);

    // delete CZ / LZ / PZ / AZ banks
    if (Fill_type == 0x1) {
      sprintf(bk_name,"CZ%03i",frontend_index); // muon fill type
    }
    else if(Fill_type == 0x2) {
      sprintf(bk_name,"LZ%03i",frontend_index); // laser fill type
    }
    else if(Fill_type == 0x3) {
      sprintf(bk_name,"PZ%03i",frontend_index); // pedestal fill type
    }
    else if(Fill_type == 0x4) {
      sprintf(bk_name,"AZ%03i",frontend_index); // async fill type
    }
    bk_delete(pevent,bk_name);
    dbprintf("%s(%d): deleted bank %s\n", __func__, __LINE__, bk_name);
    
    
      } // end lossless compression
    

14/05/2024 12:19

Scanning the 192.168.xxx.xxx subnet with nmap 192.168.0.0/16 shows the following hosts, one of which should be the VadaTech MCH:

192.168.20.14 <-- Network Research
192.168.20.230 <-- Network Research
192.168.60.15 <-- VadaTech
192.168.60.17 <-- VadaTech
192.168.60.18 <-- VadaTech
192.168.60.19 <-- VadaTech

For some reason I can't scroll up in the terminal, which is annoying, but this is all I can see. Apparently CentOS 7 clears the terminal scrollback after some time.

In any event, pinging any of the VadaTech modules hangs. The destination is reachable from the 'be' computer's perspective, but there is no response from the modules. I can, however, ping T1 and T2 located at 192.168.20.13 and 192.168.20.14.


14/05/2024 13:11

For our temporary setup using WiFi, the 'be' computer can be reached by SSH tunneling. See instructions below:

Remotely connect to newest desktop
Port forward connections for midas, crate monitor, data monitor:

ssh -L 8080:localhost:8080 -L 8000:localhost:8000 -L 7000:localhost:7000 pioneer@10.47.95.44

Remotely connect to 'be'
Port forward connections for midas, crate monitor, data monitor:

ssh -L 8080:localhost:8080 -L 8000:localhost:8000 -L 7000:localhost:7000 root@10.0.0.3

Passwords for both are mu->egamma


14/05/2024 13:19

I tested swapping out the MCHs (putting the N.A.T. MCH in our second crate). I was able to ping the N.A.T. MCH with ping 192.168.1.41:

[root@localhost ~]# ping 192.168.1.41
PING 192.168.1.41 (192.168.1.41) 56(84) bytes of data.
64 bytes from 192.168.1.41: icmp_seq=1 ttl=255 time=0.313 ms
64 bytes from 192.168.1.41: icmp_seq=2 ttl=255 time=0.355 ms

14/05/2024 13:37

Here are the untruncated results of nmap 192.168.0.0/16:

[root@localhost output_files]# cat nmap_output2.txt

Starting Nmap 6.40 ( http://nmap.org ) at 2024-05-14 12:49 EDT
Nmap scan report for 192.168.1.1
Host is up (0.00011s latency).
All 1000 scanned ports on 192.168.1.1 are filtered
MAC Address: 00:60:55:00:01:DF (Cornell University)

Nmap scan report for 192.168.4.3
Host is up (-0.034s latency).
All 1000 scanned ports on 192.168.4.3 are filtered
MAC Address: 00:60:55:00:01:BC (Cornell University)

Nmap scan report for 192.168.1.100
Host is up (0.000031s latency).
Not shown: 997 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
111/tcp  open  rpcbind
2049/tcp open  nfs

Nmap scan report for 192.168.20.13
Host is up (-0.10s latency).
All 1000 scanned ports on 192.168.20.13 are filtered
MAC Address: 08:00:30:F3:04:33 (Network Research)

Nmap scan report for 192.168.20.14
Host is up (0.00011s latency).
All 1000 scanned ports on 192.168.20.14 are filtered
MAC Address: 08:00:30:F3:04:73 (Network Research)

Nmap scan report for 192.168.40.230
Host is up (0.0015s latency).
All 1000 scanned ports on 192.168.40.230 are filtered
MAC Address: 00:13:3A:0A:21:72 (VadaTech)

Nmap scan report for 192.168.60.15
Host is up (0.00092s latency).
All 1000 scanned ports on 192.168.60.15 are filtered
MAC Address: 00:13:3A:0A:14:B9 (VadaTech)

Nmap scan report for 192.168.60.17
Host is up (0.0012s latency).
All 1000 scanned ports on 192.168.60.17 are filtered
MAC Address: 00:13:3A:0A:14:B9 (VadaTech)

Nmap scan report for 192.168.60.18
Host is up (0.0012s latency).
All 1000 scanned ports on 192.168.60.18 are filtered
MAC Address: 00:13:3A:0A:14:B9 (VadaTech)

Nmap scan report for 192.168.60.19
Host is up (0.0011s latency).
All 1000 scanned ports on 192.168.60.19 are filtered
MAC Address: 00:13:3A:0A:14:B9 (VadaTech)

Nmap scan report for 192.168.122.1
Host is up (0.000031s latency).
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
53/tcp   open  domain
111/tcp  open  rpcbind
2049/tcp open  nfs

Nmap done: 47872 IP addresses (11 hosts up) scanned in 1743.13 seconds
[root@localhost output_files]#

14/05/2024 15:16

I tried to connect to the VadaTech MCH webpage from the newest desktop, which is connected to 'be', which in turn is connected to the VadaTech MCH:

ssh -L 8081:192.168.40.230:80 root@10.0.0.3

Then going to localhost:8081 should bring up the webpage, but it just hangs. The 80 at the end of the -L flag's argument is the destination port on the target host (80 = HTTP), so connections to localhost:8081 get forwarded to port 80 on 192.168.40.230 (this same forwarding worked for the N.A.T. MCH).

I then tried to remove 'be' as the middleman. I reconfigured the network settings on the newest desktop so it is on the 192.168.xxx.xxx network and connected it directly to the MCH GbE0 port. I was able to ping T1, T2, and the two WFD5s in the crate, so the connection "worked." But I still have all the same problems as before (can't see the webpage by going to http://192.168.40.230/, can't ping the MCH, etc.)


14/05/2024 16:49

I was able to get into the MCH by setting the computer's IP to 192.168.60.xxx.

#
# Connect to MCH
#
TYPE=Ethernet
BOOTPROTO=static
IPADDR=192.168.60.100
NETMASK=255.255.0.0
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp5s0
DEVICE=enp5s0
ONBOOT=yes

It turns out our MCH was configured to ignore traffic from outside its subnet: netmask 255.255.254.0 with IP 192.168.60.15 covers 192.168.60.0 - 192.168.61.255, as you can see when doing vim /etc/rc.d/rc.conf:

# net interface 0
export SYSCFG_IFACE0=n
export INTERFACE0="eth0"
export IPADDR0="0.0.0.0"
export NETMASK0="0.0.0.0"
export BROADCAST0="0.0.0.0"
export GATEWAY0="0.0.0.0"
export NAMESERVER0="0.0.0.0"
# net interface 1
export SYSCFG_IFACE1=y
export INTERFACE1="eth1"
export IPADDR1="192.168.60.15"
export NETMASK1="255.255.254.0"
export BROADCAST1="192.168.61.255"
export GATEWAY1="192.168.60.121"
export NAMESERVER1="0.0.0.0"
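The netmask arithmetic above can be double-checked with a quick sketch using Python's standard ipaddress module (assuming a stock Python 3 is available; the values are copied from the MCH's rc.conf):

```python
import ipaddress

# eth1 settings from the MCH's rc.conf
net = ipaddress.ip_network("192.168.60.15/255.255.254.0", strict=False)

print(net)                    # 192.168.60.0/23, i.e. 192.168.60.0 - 192.168.61.255
print(net.broadcast_address)  # 192.168.61.255, matching BROADCAST1
# A host at 192.168.60.100 is inside the subnet; 192.168.1.1 is not,
# which is why the MCH ignored our earlier traffic.
print(ipaddress.ip_address("192.168.60.100") in net)  # True
print(ipaddress.ip_address("192.168.1.1") in net)     # False
```

This agrees with the BROADCAST1 value in the file, so setting the computer's IP to 192.168.60.xxx (or 192.168.61.xxx) puts it inside the MCH's subnet.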

To edit files, you have to run:

mount -o remount,rw /

Also, for some reason to use vim on the MCH you have to run:

:set nocompatible

first.


14/05/2024 16:55

I was able to change the MCH's crate number to "2" by following the steps in this pdf.
mch_network_configuration.pdf


14/05/2024 17:36

Previously, I solved the problem with the 10GbE link by putting the 10GbE AMC port on a different subnet (192.168.1.150). However, this will no longer work with two crates. In short, a netmask must be a contiguous run of leading 1-bits, so a subnet like 192.168.{1 or 2}.{1 to 128} or anything similar is not allowed: you can't constrain the 3rd and 4th octets simultaneously.
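This constraint can be sketched with Python's standard ipaddress module: a mask that pins down part of the 4th octet while leaving the 3rd octet free is simply not representable.

```python
import ipaddress

# A contiguous mask like /24 is fine:
print(ipaddress.ip_network("192.168.1.0/255.255.255.0"))  # 192.168.1.0/24

# "Any 3rd octet, but only certain 4th-octet values" would need a
# non-contiguous mask such as 255.255.0.128, which is rejected outright:
try:
    ipaddress.ip_network("192.168.0.0/255.255.0.128")
except ValueError as err:
    print("invalid mask:", err)
```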

As a result, I need to find a way to get the 10GbE link on a different subnet (e.g. 192.168.50.xxx, or something like that). I failed to do this before, and I'm unsure why it didn't work; I didn't test too thoroughly.

27/03/2024 20:38 (earlier notes from the failed attempt):

I am trying to change the IP to 192.168.10.1

Pick an action (h for menu): wv 0x1c1c 0xc0a80a01
Writing to T1:
00001c1c: c0a80a01

Pick an action (h for menu): rv 0x1c1c
Reading T1:
00001c1c: c0a80a01
since 192 = c0, 168 = a8, 10 = 0a, and 1 = 01.

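The dotted-quad-to-register-word conversion is just packing four octets into a 32-bit value; a small Python sketch (the helper names here are my own, not part of AMC13Tool):

```python
def ip_to_word(ip):
    """Pack a dotted-quad IP into the 32-bit word written with wv 0x1c1c."""
    a, b, c, d = (int(octet) for octet in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def word_to_ip(word):
    """Unpack a 32-bit register value back into a dotted-quad IP."""
    return ".".join(str((word >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print(hex(ip_to_word("192.168.10.1")))  # 0xc0a80a01
print(word_to_ip(0xC0A80A01))           # 192.168.10.1
```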
I then changed enp1s0f1 to be on the 192.168.10.xxx subnet with IP 192.168.1.2. It didn't really seem to work:

[root@dhcp-10-163-105-238 amc13StandaloneMAN_2014-05-12]# ping 192.168.10.1
PING 192.168.10.1 (192.168.10.1) 56(84) bytes of data.
^C
--- 192.168.10.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms

[root@dhcp-10-163-105-238 amc13StandaloneMAN_2014-05-12]# ifdown enp5s0
Device 'enp5s0' successfully disconnected.
[root@dhcp-10-163-105-238 amc13StandaloneMAN_2014-05-12]# ping 192.168.10.1
PING 192.168.10.1 (192.168.10.1) 56(84) bytes of data.
^C
--- 192.168.10.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

[root@dhcp-10-163-105-238 amc13StandaloneMAN_2014-05-12]#
I can successfully change the IP to something else on the 192.168.1.xxx subnet, though.


15/05/2024 12:18

I don't know what I was doing wrong last time. I got the 10GbE links to work rather trivially this time.

First I changed the network settings scripts for the 10GbE ports on 'be', setting them to the networks 192.168.50.xxx and 192.168.51.xxx. enp1s0f1 is connected to crate 1 and enp1s0f0 to crate 2, so each 10GbE port address has to be on its crate's respective network.
/etc/sysconfig/network-scripts/ifcfg-enp1s0f0:

#
# Connect to AMC
#
TYPE=Ethernet
BOOTPROTO=static
IPADDR=192.168.51.100
NETMASK=255.255.255.0
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp1s0f0
DEVICE=enp1s0f0
ONBOOT=yes
AUTOCONNECT_PRIORITY=-999
MTU=9000

/etc/sysconfig/network-scripts/ifcfg-enp1s0f1:

#
# Connect to AMC
#
HWADDR=b4:b5:2f:a4:e7:fc
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
IPADDR=192.168.50.100
NETMASK=255.255.255.0
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
UUID=f1d52da3-687b-3215-a2c0-60c11d0fd3bf
ONBOOT=yes
AUTOCONNECT_PRIORITY=-999
MTU=9000
DEVICE=enp1s0f1
NAME=enp1s0f1

For crate 1:
Follow these steps.

Set the T1 and T2 IPs again. Ensure the correct IP and network base in systemVars.py; it should look like this:

#File to specify what the default varaibles addresses are used in your system

#Default IP address for commercial MCH module
# our NAT MCH address
DEFAULT_HOST_IP="192.168.1.41"
# our Vadatech MCH address
#DEFAULT_HOST_IP="192.168.2.15"

#Default AMC13 slot number
DEFAULT_AMC13_SLOT=13

#Location of 'config_tools'. This should never need to be changed
DEFAULT_CONFIG_DIR="./config_tools"

#Network base for your uTCA crate's AMC modules
NETWORK_BASE="192.168.1"
#NETWORK_BASE="192.168.2"

Now set the IPs:

cd /home/installation_testing/packages/experiment/lxedaq/amc13/amc13_v1_2_18/dev_tools/amc13Config
./applyConfig.py -i 192.168.1.13
The -i flag defines the T1 and T2 IPs: T1 gets the argument of -i, and T2 gets the same address with the last octet incremented by one.
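The T1/T2 convention can be expressed as a short Python helper (illustrative only; applyConfig.py handles this internally):

```python
def t2_from_t1(t1_ip):
    """Return T2's IP: T1's address with the last octet incremented by one."""
    octets = t1_ip.split(".")
    octets[-1] = str(int(octets[-1]) + 1)
    return ".".join(octets)

print(t2_from_t1("192.168.1.13"))  # 192.168.1.14
print(t2_from_t1("192.168.2.13"))  # 192.168.2.14
```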

Test pinging T1 and T2:

ping 192.168.1.13
ping 192.168.1.14

Configure 10GbE link

cd /home/installation_testing/packages/experiment/lxedaq/amc13/amc13StandaloneMAN_2014-05-12
bin/AMC13Tool -i 192.168.1.13

Within AMC13Tool:

  1. Enable DAQ Link:
    Pick an action (h for menu): i 0-11 d
    Enabling AMC inputs from list: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
    Link status: 0fff0fff
    Enable DAQ Link
    'CONTROL1': 813f0003
    Note: This has parameters:
    i <ena_list> (d) (f)    enable AMCs from input list. Enable  (d)AQlink, (f)ake data,
             (t) (l)    use local (T)TC signal, enable (L)ocal triggers,
             (r) (b)    TTC(r)x, monBuf (b)ackpressure

I'm not sure which one to use.

  2. Enable SFP+ Ports:
    Pick an action (h for menu): wv 0x3 0x1fff
    Writing to T1:
      00000003: 00001fff
  3. Change SFP+ port IP address to 192.168.50.1:
    Pick an action (h for menu): wv 0x1c1c 0xC0A83201
    Writing to T1:
      00001c1c: c0a83201

Now cycle the 10GbE port on 'be':

ifdown enp1s0f1
ifup enp1s0f1

Try pinging

ping 192.168.50.1

Also ensure you can still ping the MCHs:

ping 192.168.1.41
ping 192.168.2.15

(Sometimes enp1s0f1 will start stealing traffic from enp5s0. To fix this, run ifdown enp5s0, ifdown enp1s0f1, ifup enp5s0, ifup enp1s0f1, in that order, and retry pinging.)

For crate 2:
Follow these steps.

Set the T1 and T2 IPs. Ensure the correct IP and network base in systemVars.py; it should look like this:

#File to specify what the default varaibles addresses are used in your system

#Default IP address for commercial MCH module
# our NAT MCH address
#DEFAULT_HOST_IP="192.168.1.41"
# our Vadatech MCH address
DEFAULT_HOST_IP="192.168.2.15"

#Default AMC13 slot number
DEFAULT_AMC13_SLOT=13

#Location of 'config_tools'. This should never need to be changed
DEFAULT_CONFIG_DIR="./config_tools"

#Network base for your uTCA crate's AMC modules
#NETWORK_BASE="192.168.1"
NETWORK_BASE="192.168.2"

Now set the IPs:

cd /home/installation_testing/packages/experiment/lxedaq/amc13/amc13_v1_2_18/dev_tools/amc13Config
./applyConfig.py -i 192.168.2.13
The -i flag defines the T1 and T2 IPs: T1 gets the argument of -i, and T2 gets the same address with the last octet incremented by one.

Test pinging T1 and T2:

ping 192.168.2.13
ping 192.168.2.14

Configure 10GbE link

cd /home/installation_testing/packages/experiment/lxedaq/amc13/amc13StandaloneMAN_2014-05-12
bin/AMC13Tool -i 192.168.2.13

Within AMC13Tool:

  1. Enable DAQ Link:
    Pick an action (h for menu): i 0-11 d
    Enabling AMC inputs from list: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
    Link status: 0fff0fff
    Enable DAQ Link
    'CONTROL1': 813f0003
    Note: This has parameters:
    i <ena_list> (d) (f)    enable AMCs from input list. Enable  (d)AQlink, (f)ake data,
             (t) (l)    use local (T)TC signal, enable (L)ocal triggers,
             (r) (b)    TTC(r)x, monBuf (b)ackpressure

I'm not sure which one to use.

  2. Enable SFP+ Ports:
    Pick an action (h for menu): wv 0x3 0x1fff
    Writing to T1:
      00000003: 00001fff
  3. Change SFP+ port IP address to 192.168.51.1:
    Pick an action (h for menu): wv 0x1c1c 0xC0A83301
    Writing to T1:
      00001c1c: c0a83301

Now cycle the 10GbE port on 'be':

ifdown enp1s0f0
ifup enp1s0f0

Try pinging

ping 192.168.51.1

Also ensure you can still ping the MCH(s):

ping 192.168.1.41
ping 192.168.2.15

(Sometimes enp1s0f1 will start stealing traffic from enp5s0. To fix this, run ifdown enp5s0, ifdown enp1s0f1, ifup enp5s0, ifup enp1s0f1, in that order, and retry pinging.)